42 research outputs found
Generating expository dialogue from monologue: Motivation, corpus and preliminary rules
Generating expository dialogue from monologue is a task that poses an interesting and rewarding challenge for Natural Language Processing. This short paper has three aims: firstly, to motivate the importance of this task, both in terms of the benefits of expository dialogue as a way to present information and in terms of potential applications; secondly, to introduce a parallel corpus of monologues and dialogues which enables a data-driven approach to this challenge; and, finally, to describe work-in-progress on semi-automatic construction of Monologue-to-Dialogue (M2D) generation rules
Constructing the CODA corpus: A parallel corpus of monologues and expository dialogues
We describe the construction of the CODA corpus, a parallel corpus of monologues and expository dialogues. The dialogue part of the corpus consists of expository, i.e., information-delivering rather than dramatic, dialogues written by several acclaimed authors. The monologue part of the corpus is a paraphrase in monologue form of these dialogues by a human annotator. The corpus was constructed as a resource for extracting rules for automated generation of dialogue from monologue. Using authored dialogues allows us to analyse the techniques used by accomplished writers for presenting information in the form of dialogue. The dialogues are annotated with dialogue acts and the monologues with rhetorical structure. We developed annotation and translation guidelines together with a custom-developed tool for carrying out translation, alignment and annotation
Question generation in the CODA project
In the ongoing CODA project, we are developing a system for automatically converting monologue into dialogue. The dialogue is generated in a two-step approach. Firstly, snippets of input monologue are mapped to dialogue act sequences. Secondly, these sequences are verbalized. The conversion relies partly on analysing input monologue in terms of its discourse relations. This short paper briefly describes the approach to the first step in CODA. This approach involves the use of a parallel corpus of monologues and dialogues to learn mappings from monologue to dialogue acts. Here, we focus on dialogue acts that involve question asking
Harvesting re-usable high-level rules for expository dialogue generation
This paper proposes a method for extracting high-level rules for expository dialogue generation. The rules are extracted from dialogues that have been authored by expert dialogue writers. We examine the rules that can be extracted by this method, focusing on whether different dialogues and authors exhibit different dialogue styles
Data-oriented monologue-to-dialogue generation
This short paper introduces an implemented and evaluated monolingual Text-to-Text generation system. The system takes monologue and transforms it to two-participant dialogue. After briefly motivating the task of monologue-to-dialogue generation, we describe the system and present an evaluation in terms of fluency and accuracy
Concept Type Prediction and Responsive Adaptation in a Dialogue System
Responsive adaptation in spoken dialog systems involves a change in dialog system behavior in response to a user or a dialog situation. In this paper we address responsive adaptation in the automatic speech recognition (ASR) module of a spoken dialog system. We hypothesize that information about the content of a user utterance may help improve speech recognition for the utterance. We use a two-step process to test this hypothesis: first, we automatically predict the task-relevant concept types likely to be present in a user utterance using features from the dialog context and from the output of first-pass ASR of the utterance; and then, we adapt the ASR's language model to the predicted content of the user's utterance and run a second pass of ASR. We show that: (1) it is possible to achieve high accuracy in determining presence or absence of particular concept types in a post-confirmation utterance; and (2) 2-pass speech recognition with concept type classification and language model adaptation can lead to improved speech recognition performance for post-confirmation utterances
QTMM2012c+: A Queryable Empirically-Grounded Resource of Dialogue with Argumentation
This paper introduces QTMM2012c+, a resource which links relations between propositions (inference, conflict and rephrase) to dialogue act sequences. QTMM2012c+ builds on the MM2012c annotated corpus of BBC Moral Maze debates, extending it with new annotations – for speaker roles (chair, panellists and witnesses), speaker stances (neutral, pro and con) and locution chronological ordering – and making the information available in a queryable format. We show how the new resource allows for: i) automatic extraction of empirically-grounded dialogue rules which describe choice and frequency of dialogue acts with specific argumentative functions given the dialogue history, and ii) extraction of generation templates that reflect naturally-occurring argumentative locutions in empirically-grounded dialogue. QTMM2012c+ facilitates automatic analysis of argument transitions between speakers, extending previous manual analysis of the MM2012c corpus, enabling empirical tests of theories of argumentative dialogue
The First Question Generation Shared Task Evaluation Challenge
The paper briefly describes the First Shared Task Evaluation Challenge on Question Generation that took place in Spring 2010. The campaign included two tasks: Task A – Question Generation from Paragraphs and Task B – Question Generation from Sentences. An overview of each of the tasks is provided
Discourse annotation - Towards a dialogue system for pair programming
Much work has been carried out on dialogue system development in different fields. With recent advances in Programming Language Processing tasks, dialogue systems aimed at programmers are becoming another viable area of application. However, the data necessary for a dialogue system that can assist programmers involves not only code, but also the natural language around it. How should this data be annotated? In this review we examine the most common approaches to dialogue annotation, paying special attention to programming settings. We first look at the broader theories that inform these approaches, and after our review of the most widely used annotation schemes we analyze the peculiarities of the programming context and how well suited the existing schemes are for this setting